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E-MAIL SYSTEM PROVIDING FILTERING METHODOLOGY ON A PER-DOMAIN 

BASIS 

5 COPYRIGHT NOTICE 

A portion of the disclosure of this patent document contains material that is subject to 
copyright protection. The copyright owner has no objection to the facsimile reproduction by 
anyone of the patent document or the patent disclosure as it appears in the Patent and 
Trademark Office patent file or records, but otherwise reserves all copyright rights 
10 whatsoever. 

BACKGROUND OF THE INVENTION 

L Field of the Invention 

The present invention relates generally to electronic mail (e-mail) systems and, more 
particularly, to improved methodology for filtering e-mail messages sent from various host 
15 domains. 

2. Description of the Background Art 

Today, electronic mail or "e-mail" is a pervasive, if not the most predominant, form 
of electronic communication. Fig. 1 A illustrates the basic architecture of a typical electronic 

20 mail system 10. At a high level, the system includes a mail server connected over a network 
to various e-mail "clients," that is, the individual users of the system. More specifically, the 
system 10 includes one or more clients 1 1 connected over a network to at least one SMTP 
(Simple Mail Transfer Protocol) server or "Mail Transport Agent" (MTA) 12a for routing e- 
mail. Users write, send, and read e-mail via Mail User Agents (MUA), such as Microsoft 

25 Outlook™^ present at each client (computer). To send e-mail, an MUA connects to an MTA 
which receives the e-mail and routes it to another MTA. An intermediary MTA might 
forward the e-mail to yet another MTA until the e-mail reaches the destination system, where 
the e-mail is stored in a mailbox accessible by the recipient. 
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A typical e-mail delivery process is as follows. In the following scenario, Larry sends 
e-mail to Martha at her e-mail address: martha@example,org, Martha's Internet Service 
Provider (ISP) uses an MTA, such as provided by Sendmail® for NT, available from 
Sendmail, Inc. of Emery ville, CA. (With a lower case "s," "sendmail" refers to Sendmail' s 
5 MTA, which is one component of the Sendmail® Switch product line.) 

1 . Larry composes the message and chooses Send in Microsoft Outlook Express (a "Mail 
User Agent" or MUA). The e-mail message itself specifies one or more intended recipients 
(i.e., destination e-mail addresses), a subject heading, and a message body; optionally, the 

10 message may specify accompanying attachments. 

2. Microsoft Outlook Express queries a DNS server for the IP address of the local mail server 
running sendmail. The DNS server translates the domain name into an IP address, e.g., 

10. L 1.1, of the local mail server. 

15 

3. Microsoft Outlook Express opens a TCP/IP connection to the local mail server running 
sendmail. The message is transmitted to a second sendmail server using the SMTP protocol 
(e.g., as defined in RFC 821). 

20 4. sendmail queries a DNS server for the MX record of the destination domain, i.e., 

example.org. The DNS server returns a hostname (e.g., mail.example.org) for the destination 
domain, sendmail queries a DNS server for the A record of mail.example.org (i.e., the IP 
address). The DNS server returns an IP address of, for example, 127.118.10.3. 

25 5. sendmail opens a TCP/IP connection to the remote mail server providing e-mail service for 
example.org, which is also running sendmail. The message is transmitted to the remote 
sendmail mail server using the SMTP protocol. 

6. sendmail delivers Larry's message for Martha to the local delivery agent. It appends the 
30 message to Martha's mailbox. By default, the message is stored in (e.g., using a sample file 

path on a UNIX system): 

/var/spool/mail/martha. 

7. Martha has her computer dial into her ISP. 

35 

8. Martha chooses "Check Mail" in Eudora. 

9. Eudora opens a P0P3 (Post Office Protocol version 3, defined in RFC 1725) connection 
with the P0P3 (incoming mail) server. Eudora downloads Martha's new messages, 

40 including the message from Larry. 

10. Martha reads Larry's message. 
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The MTA, which is responsible for queuing up messages and arranging for their 
distribution, is the workhorse component of electronic mail systems. The MTA "listens" for 
incoming e-mail messages on the SMTP port, which is generally port 25. When an e-mail 
message is detected, it handles the message according to configuration settings, that is, the 
5 settings chosen by the system administrator, in accordance with relevant standards such as 
Request For Comment documents (RFCs). Typically, the mail server or MTA must 
temporarily store incoming and outgoing messages in a queue, the "mail queue." Actual 
queue size is highly dependent on one's system resources and daily volumes. 

MTAs, such as the commercially-available Sendmail® MTA, perform three key mail 
10 transport functions: 

1. Route mail across the Internet to an MTA serving a different network or "domain" 
(since many domains can and do exist in a single network); 

2. Relay mail to another MTA (e.g., MTA 12b on Fig. lA) on a different subnet within 
the same network; 

15 3. Transfer mail from one host or server to another on the same network subnet. 

To perform these functions, an MTA accepts messages from other MTAs or MUAs, parses 
addresses to identify recipients and domains, resolves aliases, fixes addressing problems, 
copies mail into a queue on its hard disk, tries to process long and hard-to-pass messages, and 

20 notifies the sender when a particular task cannot be successfully completed. The MTA does 
not store messages (apart from its queue) or help users access messages. It relies on other 
mail system components, such as message delivery agents, message stores and mail user 
agents (MUAs), to perform these tasks. These additional components can belong to any 
number of commercial or free products (e.g., POPS or IMAP servers, Microsoft Exchange, 

25 IBM Lotus Notes, Netscape, cc:Mail servers, or the like). Because of its central role in e- 
mail systems, however, the MTA often serves as the "glue" that makes everything appear to 
work together seamlessly. 

The overall process may be summarized as follows. E-mail is routed via SMTP 
servers, the so-called "Mail Transport Agents" (MTA). Users write, send, and read e-mail 

30 via Mail User Agents (MUA). To send e-mail, an MUA connects to an MTA, which 

receives the e-mail and routes it to another MTA. An intermediary MTA might forward the 
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e-mail to yet another MTA until the e-mail reaches the destination system, where the e-mail 
is stored in a mailbox accessible by the recipient. 

For further description of e-mail systems, see e.g., Sendmail® for NT User Guide, 
Part Number DOC-SMN-300-WNT-MAN-0999, available from Sendmail, Inc. of 
5 Emeryville, CA, the disclosure of which is hereby incorporated by reference. Further 
description of the basic architecture and operation of e-mail systems is available in the 
technical and trade literature, see e.g., the following RFC (Request For Comments) 
documents: 

10 RFC 821 Simple Mail Transfer Protocol (SMTP) 

RFC 822 Standard for the Format of ARPA Intemet Text Messages 

RFC 974 Mail Routing and the Domain System 

RFC 937, RFC 1081 Post Office Protocol version 3 (P0P3) 

RFC 1 123 Requirements for Intemet Hosts ~ AppUcation and Support 

15 RFC 1725 Post Office Protocol version 3 (P0P3) 

RFC 2033 Local Mail Transfer Protocol (LMTP) 

RFC 2060, RFC 2061 Intemet Message Access Protocol (IMAP) 
RFC 2246 The TLS Protocol, version 1 .0 

RFC 2487 SMTP Service Extension for Secure SMTP over TLS 

20 

RFCs are numbered Intemet informational documents and standards widely followed by 
conamercial software and freeware in the Intemet and UNIX communities. The RFCs are 
unusual in that they are floated by technical experts acting on their own initiative and 
reviewed by the Intemet at large, rather than formally promulgated through an institution 

25 such as ANSI. For this reason, they remain known as RFCs even once they are adopted as 
standards. The above-listed RFC documents are currently available via the Intemet (e.g., at 
http://wwwJetf.org/rfc), the disclosures of which are hereby incorporated by reference. 

During operation of the Sendmail MTA, a listening process operates to detect 
requests for new connections. When a request for a new connection arrives, Sendmail makes 

30 a new copy or instance of itself through a "forking" operation. The "forked" or new process 
deals with the new connection exclusively. Thus, at the conclusion of the forking operation 
two processes exist: a listening process to detect requests for new connections, and a forked 
process which was created for the purpose of exclusively handling a particular new 
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connection. This forking operation may be repeated to spawn other child processes, each one 
for exclusively handling a particular new connection. 

Each child process that is created has no knowledge of the other child processes (i.e., 
no memory access to the data structures of the other child processes). This lack of 
5 knowledge of other child processes leads to system vulnerability. For example, the child 
processes cannot detect that the system is being "slanmied" by many connections from a 
particular host. In this scenario, only the parent process (i.e., the process spawning the child 
processes) would be able to know (i.e., be able to maintain information) about the source of 
all of the connections to the system. Once a forking operation occurs, there is no interprocess 

10 communication between the child processes that would allow these processes to detect the 
foregoing slanmiing scenario. 

Fig, IB illustrates this problem in further detail. In the figure, Parent process 101 
contains a variable A having an initial value of 1 (i.e., at Time TO). Suppose, at Time Tl, a 
child process. Child 1 (shown at 105), is created using the forking operation. At the instance 

15 that Child 1 is created, the entire memory space of the Parent is copied to Child 1. Thus, at 
Time Tl, Child 1 also contains a variable A having a value of 1, as shown. Suppose, at Time 
T2, that the Parent process 101 changes the value of the variable A to 2. The value of the 
variable A in Child 1 (105) is not changed, as the two processes are now separate. 
Proceeding with the example, at Time T3, Child 2 (shown at 107) is created, again using the 

20 forking operation. As before, the newly created Child process receives a copy of the then- 
current copy of the memory space of the Parent. Thus, Child 2 (107) contains a variable A 
having a value of 2, as shown. Note in particular that the first Child process, Child 1, has no 
knowledge that the current value of A (whatever significance it may have) in the parent is 
now 2. The second Child process, Child 2, on the other hand, does know that the current 

25 value of A in the parent is 2, by virtue of the fact that Child 2 was created at a later point in 
time (i.e.. Time T3). 

Without the establishment of interprocess conmiunication (IPC) between the 
processes (e.g., shared memory, UNIX-style pipes, or the like), there is no knowledge shared 
among the processes about the current state of each process' data. In the context of an e-mail 

30 system, if there are several connections coming from a single host (for which the Parent 
process has created several Child processes), there is no shared knowledge among the 
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processes to indicate which process is handling what connection. Thus, the system cannot 
readily determine that, for example, it may be handling ten incoming connections for a single 
domain (e.g., AOL.com). At the same time, however, the system may in fact benefit from 
identifying that scenario so that the system can moderate usage of its resources by various 
domains. For instance, if a particular domain is hogging the resources of the system (e.g., 
"slamming" the system with a multitude of e-mail messages), the system would want to 
identify that situation and take corrective action. 

To date, efforts at addressing the foregoing problem have not provided a domain- 
specific solution and, therefore, have been sub-optimal For example, existing e-mail 
systems may be configured to Umit the number of child processes created at a given instance 
in time. That approach, however, simply provides a general limit on system resources. The 
approach does not address over-utilization or abuse of system resources by a particular 
domain. 

Given the ever-increasing reliance on e-mail as a preferred medium for business and 
personal communication, there is much interest in improving the performance and reliabihty 
of e-mail systems. Accordingly, there is a need for an e-mail system that incorporates 
methodology for moderating usage of its resources on a per-domain basis. The present 
invention fulfills this and other needs. 



Page 7 of 43 



GLOSSARY 

fork: Refers to UNIX (or UNIX-like) /6>rA:fj function is used to create a new process from an 
existing process. The new process is called the child process, and the existing process is 
called the parent. The child process inherits from the parent: process credentials 
5 (real/effective/saved UIDs and GIDs) environment, stack, memory, open file descriptors, 
close-on-exec flags, signal handhng settings, scheduler class, process group ID, session ID, 
current working directory, root directory, file mode creation mask, resource limits, and 
controlling terminal. Unique to the child (i.e., not inherited): process ED, different parent 
process ID, own copy of file descriptors and directory streams, and own process, text, data 
1 0 and other memory locks. 

MTA: Short for Mail Transport Agent (also sometimes referred to as Mail Transfer Agent, 
Message Transport Agent, or Message Transfer Agent); that part of a mail deUvery system 
that routes between its origin (the sending user and the program used to create mail) and its 
destination (the agent which writes mail into mailboxes). 

15 SMTP: Short for Simple Mail Transfer Protocol, a protocol for sending e-mail messages 
between servers. Most e-mail systems that send mail over the Internet use SMTP to send 
messages from one server to another; the messages can then be retrieved with an e-mail 
client using either POP or IMAP. Further description of SMTP may be found in RFC 821 
(and subsequent RFC 2821), the disclosure of which is hereby incorporated by reference. 

20 Copies of the foregoing RFCs may be found on the Internet at http://www.faqs.org/rfcs, 

spam/spamming: Spam is unsoUcited junk email. Spamming is the act of sending spam 
(usually in large quantities). 

TCP: Stands for Transmission Control Protocol. TCP is one of the main protocols in 
TCP/IP networks. Whereas the IP protocol deals only with packets, TCP enables two hosts 
25 to establish a connection and exchange streams of data. TCP guarantees delivery of data and 
also guarantees that packets will be delivered in the same order in which they were sent. For 
an introduction to TCP, see, e.g., RFC 793, the disclosure of which is hereby incorporated by 
reference. 

TCP/IP: Stands for Transmission Control Protocol/Internet Protocol, the suite of 
30 communications protocols used to connect hosts on the Internet. TCP/IP uses several 
protocols, the two main ones being TCP and IP. TCP/IP is built into the UNIX operating 
system and is used by the Internet, making it the de facto standard for transmitting data over 
networks. For an introduction to TCP/IP, see e.g., RFC 1180: A TCP/IP Tutorial, the 
disclosure of which is hereby incorporated by reference. A copy of RFC 1 180 is currently 
3 5 available at ftp: //ftp. isL edu/in-notes/rfcl 1 80. txt. 
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SUMMARY OF THE INVENTION 

In accordance with the present invention, operation of an e-mail system is modified to 
incorporate a flow control filter (service). During processing of incoming e-mail, each child 
MTA process (that is created to handle a particular new connection) connects to the flow 
5 control filter service, so that it can interact with the service during arrival of a message. This 
interaction provides a complete description of the incoming client, including IP address and 
host name, as well as the complete SMTP interaction, including HELO (i.e., initial "hello" 
handshake), MAIL FROM (i.e., sender information), RCPT TO (i.e., recipient list), and 
DATA (i.e., entire message body). Since the flow control filter service monitors all children 
C 10 processes, it attains a global view of traffic flowing through the system. By virtue of its 
global view, the flow control filter service can track information on a per domain basis, 
including total volume of e-mail received from a particular domain over a given period of 
yj time. Examples of other metrics that may be tracked include total connections and total 

^"^ senders (count) encountered for a particular domain over a given period of time. Other 

y 15 examples include total number of different recipients, total number of envelopes, and total 
yj aggregate volume of mail. Since the knowledge lost by the forking process is captured by 

the flow control filter service, the service is in a position to enforce policy-based rules, 
^ including placing restrictions on child processes, based on the per-domain tallies 

encountered, 

20 The overall methodology of operation may be summarized as follows. The following 

method steps occur in the context of an incoming message that is being processed by the e- 
mail system (i.e., MTA forking has already occurred) and now the system is ready to evoke 
the services of the flow control filter of the present invention. Invocation of the flow control 
filter begins with the MTA (i.e., a child MTA of the original (parent) listener) connecting to 

25 the flow control filter (e.g., using Sendmail Milter protocol); the filter accepts the connection. 
The MTA and the filter perform a handshake sequence, including feature and parameter 
negotiation. At the conclusion of the handshake sequence, a new thread is created (i.e., in the 
flow control engine) for processing the new connection/message. Now, the MTA passes to 
the filter the corresponding connection information (e.g., IP address and host name) of the 

30 sending MTA. Based on the connection information, the filter may look up matching class 
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data from the configuration file. In the event that no matching class data is found, the filter 
will assume unrestricted access for the host and therefore will accept the connection and 
message. In that case, the flow control engine thread handling the connection may terminate, 
as there is no further filtering work to be done for this incoming connection and message; the 
5 MTA proceeds normally with no further interaction with the filter. Otherwise, the method 
proceeds to the following filtering steps. The method tests whether class limits have been 
reached. In the event that limits have not been reached (i.e., true), the filter instructs the 
MTA to continue and increments the current connection count. Otherwise (i.e., false), the 
method terminates with the filter rejecting the connection and returning an administrator- 

10 defined error code. In the event that the process did not terminate, the MTA reports the 
sender information to the filter; this occurs in response to the MAIL FROM SMTP phase. 

The method notes the sender (i.e., who is the sender) in the class. The administrator- 
defined class may include, for example, a sender-based parameter indicating that the filter 
should note the number of unique senders that have arrived in a given timeframe for this 

15 particular host (of the class). In a manner similar to above, the method tests whether class' 
sender limits have been reached. In the event that limits have not been reached (i.e., true), 
the filter instructs the MTA to continue and increments the current unique sender totals. 
Otherwise, the method terminates with the filter rejecting the message (returning any 
administrator-defined error code). In the event that the filtering process has not terminated 

20 based on sender information, the method proceeds to test recipient (RCPT TO) information. 
The configuration file allows the administrator to define a class that Umits the number of 
unique recipients received for that class, over any given time span. As a given message may 
have multiple recipients, the step repeats for each recipient (information) of the message. As 
before, if specified limits are exceeded, the method terminates with the filter rejecting the 

25 message (returning any administrator-defined error code). Otherwise, the method updates 
the totals and proceeds. 

The MTA reports the message body, which may be transmitted as one or more 
blocks. The method updates a running total of message size. This information is used to 
determine the aggregate total of bytes received from a given source over a period of time. 

30 The MTA reports end of message for the current incoming message. The method compares 
the message size against class limits specified in the configuration file. Again as before, if 



Page 10 of 43 



specified limits are exceeded, the method terminates with the filter rejecting the message 
(returning any administrator-defined error code). Otherwise, the incoming message has 
passed all filters and is accepted. Now, the method may repeat for other incoming messages. 
This approach may be easily scaled, for application on a site-wide basis. In that 
5 instance, the flow control filter service monitors the children processes for a number of e- 
mail servers at a given site. In such a configuration, the flow control filter service would 
apply policy on a global (site) basis, instead of on a per server basis. 

BRIEF DESCRIPTION OF THE DRAWINGS 

10 Fig. 1 A illustrates the basic architecture of a typical electronic mail system. 

Fig. IB illustrates the problem encountered with present-day use of forking 
operations to create child MTA processes. 

Fig. 2 is a block diagram of a general-purpose computer system (e.g., IBM- 
compatible PC or server computer) in which computer-implemented methods of the present 
1 5 invention may be embodied. 

Fig. 3 is a block diagram of a computer software system for directing the operation of 
the computer system of Fig. 2. 

Fig. 4 is a high-level block diagram illustrating an e-mail environment that includes a 
flow control filter of the present invention. 
20 Fig. 5 is a block diagram illustrating the flow control filter of the present invention in 

further detail. 

Figs. 6A-D comprise a single flowchart providing a summary of a flow control 
filtering methodology of the present invention. 
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DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT 

The following description will focus on the presently-preferred embodiment of the 
present invention, which is implemented in server-based software operating in an Intemet- 
connected environment running under a server operating system, such as the Microsoft® 
5 Windows NT running on an IBM-compatible server computer. The present invention, 
however, is not limited to any particular one application or any particular environment. 
Instead, those skilled in the art will find that the system and methods of the present invention 
may be advantageously embodied on a variety of different platforms, including Macintosh, 
Linux, BeOS, Solaris, UNIX, NextStep, FreeBSD, and the like. Therefore, the description of 
10 the exemplary embodiments that follows is for purposes of illustration and not limitation. 

L Computer-based implementation 

A. Basic system hardware (e.gv for desktop and server computers) 

The present invention may be implemented on a conventional or general-purpose 
computer system, such as an IBM-compatible server computer operating under a server 

15 operating system. Fig. 2 is a very general block diagram of an IBM-compatible system 200. 
As shown, system 200 comprises a central processing unit(s) (CPU) 201 coupled to a 
random-access memory (RAM) 202, a read-only memory (ROM) 203, a keyboard 206, a 
printer 207, a pointing device 208, a display or video adapter 204 connected to a display 
device 205, a removable (mass) storage device 215 (e.g., floppy disk), a fixed (mass) storage 

20 device 216 (e.g., hard disk), a communication port(s) or interface(s) 210, a modem 212, and a 
network interface card (NIC) or controller 211 (e.g., Ethemet). Although not shown 
separately, a real-time system clock is included with the system 200, in a conventional 
manner. 

CPU 201 comprises a processor of the Intel Pentium® family of microprocessors. 
25 However, any other suitable microprocessor or ndcrocomputer may be utilized for 

implementing the present invention. The CPU 201 communicates with other components of 
the system via a bi-directional system bus (including any necessary input and output (FO) 
controller circuitry and other "glue" logic). The bus, which includes address lines for 
addressing system memory, provides data transfer between and among the various 
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components. Description of Pentium-class microprocessors and their instruction set, bus 
architecture, and control lines is available from Intel Corporation of Santa Clara, CA. 
Random-access memory 202 serves as the working memory for the CPU 20 L In a typical 
configuration, RAM of sixteen megabytes or more is employed. More or less memory may 
5 be used without departing from the scope of the present invention. The read-only memory 
(ROM) 203 contains the basic input output system code (BIOS) ~ a set of low-level routines 
in the ROM that application programs and the operating systems can use to interact with the 
hardware, including reading characters from the keyboard, outputting characters to printers, 
and so forth. 

10 Mass storage devices 215and 216 provide persistent storage on fixed and removable 

media, such as magnetic, optical or magnetic-optical storage systems, or flash memory, or 
any other available mass storage technology. The mass storage may be shared on a network, 
or it may be a dedicated mass storage. As shown in Fig. 2, fixed storage 216 stores a body of 
program and data for directing operation of the computer system, including an operating 

15 system, user application programs, driver and other support files, as well as other data files of 
all sorts. Typically, the fixed storage 216 serves as the main hard disk for the system. 

In basic operation, program logic (including that which implements methodology of 
the present invention described below) is loaded from the removable storage device 215 or 
fixed storage 216 into the main (RAM) memory 202, for execution by the CPU 201. During 

20 operation of the program logic, the system 200 accepts user input from a keyboard 206 and 
pointing device 208, as well as speech-based input from a voice recognition system (not 
shown). The keyboard 206 permits selection of application programs, entry of keyboard- 
based input or data, and selection and manipulation of individual data objects displayed on 
the display screen 205. Likewise, the pointing device 208, such as a mouse, track ball, pen 

25 device, or the like, permits selection and manipulation of objects on the display screen. In 
this manner, these input devices support manual user input for any process running on the 
system. 

The computer system displays text and/or graphic images and other data on the 
display device 205. The video adapter 204, which is interposed between the display 205 and 
30 the CPU 201, drives the display device 205. The video adapter 204, which includes video 
memory accessible to the CPU, provides circuitry that converts pixel data stored in the video 
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memory to a raster signal suitable for use by a cathode ray tube (CRT) raster or liquid crystal 
display (LCD) monitor. A hard copy of the displayed information, or other information 
within the system 200, may be obtained from the printer 207, or other output device. Printer 
207 may include, for instance, an HP LaserJet® printer (available from Hewlett-Packard of 
5 Palo Alto, CA), for creating hard copy images of output of the system. 

The system itself communicates with other devices (e.g., other computers) via the 
network interface card (NIC) 21 1 connected to a network (e.g., Ethernet network), and/or 
modem 212 (e.g., 56K baud, ISDN, DSL, or cable modem), examples of which are available 
from 3Com of Santa Clara, CA. The system 200 may also conmiunicate with local 
10 occasionally-connected devices (e.g., serial cable-Unked devices) via the communication 
("conmi") interface 210, which may include a RS-232 serial port, a Universal Serial Bus 
(USB) interface, or the like. Other devices may be connected locally via the comm interface 
210. 

IBM-compatible personal computers and server computers are available from a 
15 variety of vendors. Representative vendors include Dell Computers of Round Rock, TX, 
Compaq Computers of Houston, TX, and IBM of Armonk, NY. Other suitable computers 
include Sun Solaris workstations, which are available from Sun Microsystems of Mountain 
View, CA. 

The above-described system 200 of Fig. 2 is presented for purposes of illustrating the 
20 basic hardware underlying desktop (e.g., e-mail client) and server computer (e.g., SMTP 
server) components that may be employed in the system of the present invention. Those 
skilled in the art will appreciate that the present invention may be implemented in any type of 
computer system or processing environment capable of supporting the methodologies of the 
present invention presented in detail below. 

25 B. Basic system software 

Illustrated in Fig. 3, a computer software system 300 is provided for directing the 
operation of the computer system 200. Software system 300, which is stored in system 
memory (RAM) 202 and on fixed storage (e.g., hard disk) 216, includes a kernel or operating 
system (OS) 310. The OS 310 manages low-level aspects of computer operation, including 
30 managing execution of processes, memory allocation, file input and output (I/O), and device 
I/O. One or more application programs, such as client application software or "programs" 
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301 (e.g., 301a, 301b, 301c) may be "loaded" (i.e., transferred from fixed storage 216 into 
memory 202) for execution by the software system 200. For instance, when the system 200 
is employed to control a desktop machine, appUcation software 301 includes client e-mail 
software 305 (e.g., Microsoft Outlook, available from Microsoft Corporation of Redmond, 
5 WA), as shown. When the system 200 is employed to control a server machine, on the other 
hand, appUcation software 301 includes mail server software (e.g., Sendmail® for NT). 

System 300 includes a graphical user interface (GUI) 315, for receiving user 
conmiands and data in a graphical (e.g., "point-and-click") fashion. These inputs, in turn, 
may be acted upon by the system 200 in accordance with instructions from operating system 

10 310, and/or cHent application module(s) 301. The GUI 315 also serves to display the results 
of operation from the OS 310 and apphcation(s) 301, whereupon the user may supply 
additional inputs or terminate the session. Typically, the OS 310 operates in conjunction 
with device drivers 320 (e.g., "Winsock" driver) and the system BIOS microcode 330 (i.e., 
ROM-based microcode), particularly when interfacing with peripheral devices. OS 310 can 

15 be provided by a conventional operating system, such as Microsoft® Windows 9x, by 

Microsoft® Windows NT, by Microsoft® Windows 2000, or by Microsoft® WindowsXP, 
all available from Microsoft Corporation of Redmond, WA. Alternatively, OS 310 can also 
be an alternative operating system, such as IBM OS/2 (available from IBM of Armonk, NY) 
or Linux OS (available from several vendors, including the Red Hat distribution of Linux 

20 from Red Hat, Inc. of Durham, NC). 

The above-described computer hardware and software are presented for purposes of 
illustrating the basic underlying desktop and server computer components that may be 
employed for implementing the present invention. For purposes of discussion, the following 
description will present examples in which it will be assumed that there exists an MTA (e.g., 

25 "sendmail" - the Sendmail MTA) that communicates with one or more other MTAs (e.g., 
remote MTAs), as well as other relevant systems (e.g., a flow control filter of the present 
invention, described below). The present invention, however, is not limited to any particular 
environment or hardware configuration. In particular, an MTA distinction is not necessary to 
the invention, but is used to provide a framework for discussion. Instead, the present 

30 invention may be implemented in any type of system architecture or processing environment 
capable of supporting the methodologies of the present invention presented in detail below. 
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II, Filtering on a per-domain basis 
A. Introduction 

1. Basic SMTP 

Most e-mail systems that send mail over the Internet use SMTP to send messages 
5 from one server to another. SMTP transports a mail object, which contains an envelope and 
content. The SMTP envelope is sent as a series of SMTP protocol units or commands. It 
includes an originator address (to which error reports should be directed), one or more 
recipient addresses, and optional protocol extension material. The SMTP content is sent in 
the SMTP DATA protocol unit and has two parts: the headers and the body. If the content 

10 conforms to other contemporary standards, the headers form a collection of field/value pairs 
structured as in the message format specification; the body, if structured, is defined according 
to MIME. The SMTP design is based on the following model of communication. In 
response to a user mail request, a sender-SMTP establishes a two-way transmission channel 
to a receiver-SMTP. The receiver-SMTP may be either the ultimate destination or an 

15 intermediate. SMTP commands are generated by the sender-SMTP and sent to a receiver- 
SMTP. SMTP replies are sent from the receiver-SMTP to the sender-SMTP in response to 
the commands. Once the transmission channel is established, the SMTP-sender sends a 
MAIL FROM command indicating the sender of the mail If the SMTP-receiver can accept 
mail, it responds with an OK reply. The SMTP-sender then sends a RCPT TO conomand 

20 identifying a recipient of the mail. If the SMTP-receiver can accept mail for that recipient, it 
responds with an OK reply; if not, it responds with a reply rejecting that recipient (but not the 
whole mail transaction). The SMTP-sender and SMTP-receiver may negotiate several 
recipients. When the recipients have been negotiated, the SMTP-sender sends the mail data 
(DATA command), terminating with a special sequence. If the SMTP-receiver successfully 

25 processes the mail data it responds with an OK reply. Further description of SMTP may be 
found in RFC 821 (and subsequent RFC 2821), the disclosure of which is hereby 
incorporated by reference. 
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2. Sendmail Milter interface 

The present invention is implemented as a server-side flow control filter service that 
interacts with a mail server during arrival of an incoming message. This interaction may be 
achieved through use of Sendmail "Milter" interface. The Sendmail Mail Filter API (Milter) 
5 provides an interface for third-party software to validate and modify messages as they pass 
through the mail transport system. Filters can process messages' connection (IP) 
information, envelope protocol elements, message headers, and/or message body contents, 
and modify a message's recipients, headers, and body. Using SendmaiFs corresponding 
configuration file, one can specify which filters are to be applied, and in what order, allowing 

10 an administrator to combine multiple independently-developed filters. Thus in this manner, 
the Milter plug-in architecture allows a developer to, in effect, plug into the e-mail delivery 
system for inserting custom subroutines or other processing. Accordingly, in the preferred 
embodiment, the flow control filter employs the Sendmail Milter interface for accessing 
internal phases and data of SMTP processing, and blocking or permitting message 

15 processing. For further description of Sendmail's Milter, see, e.g., "Filtering Mail with 
Sendmail" available from Sendmail, Inc. (and currently available via the Internet at 
http://www.sendmail.com/de/partner/resources/development/milter_api/), the disclosure of 
which is hereby incorporated by reference. 

B. Overview of per-domain filtering 

20 As described above, with each new connection a child MTA process is created. In 

accordance with the present invention, each child process connects to the flow control filter 
service, so that it can interact with the service during arrival of a message. This interaction 
provides a complete description of the incoming client, including DP address and host name, 
as well as the complete SMTP interaction, including HELO (i.e., initial "hello" handshake), 

25 MAIL FROM (i.e., sender information), RCPT TO (i.e., recipient Hst), and DATA (i.e., 

entire message body). Since the flow control filter service monitors all children processes, it 
attains a global view of traffic flowing through the system. By virtue of its global view, the 
flow control filter service can track information on a per domain basis, including total 
volume of e-mail received from a particular domain over a given period of time. Examples 

30 of other metrics that may be tracked include total connections and total senders (count) 
encountered for a particular domain over a given period of time. Other examples include 
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total number of different recipients, total number of envelopes, and total aggregate volume of 
mail encountered for a particular domain over a given period of time. Since the knowledge 
lost by the forking process is captured by the flow control filter service, the service is in a 
position to enforce policy-based rules, including placing restrictions on child processes, 
5 based on the per-domain tallies encountered. 

This approach may be easily scaled, for application on a site-wide basis. In that 
instance, the flow control filter service monitors the children processes for a number of e- 
mail servers at a given site. In such a configuration, the flow control filter service would 
apply policy on a global (site) basis, instead of on a per server basis. 

10 C, Architecture modification to provide per-domain filtering 

Fig. 4 is a high-level block diagram illustrating an e-mail environment 400 that 
includes a flow control filter 420 of the present invention. At the highest level, an outside 
Mail Transport Agent (MTA) 401 (e.g., an MTA residing somewhere on the Internet) 
establishes a connection with an internal MTA 41 1 (e.g., a company's internal MTA) to 

15 communicate a new e-mail message. This conomunication will occur via the SMTP protocol. 
Upon receipt of the incoming connection request, the MTA 411 (denoted as the Parent MTA, 
P) creates a first child MTA process 415 (denoted as CI). The task of processing the 
incoming connection is, in effect, handed off to the child MTA process (CI) 415. Further, 
the child MTA process (CI) 415 makes an additional connection to the flow control filter 

20 (service) 420, which has previously registered (to be alerted) through the Sendmail Milter 
interface. The child process reports to the flow control filter the exact nature of the incoming 
connection and accompanying e-mail message. This process continues for other incoming 
connections. For example, as shown, the second MTA process (C2) 417 has been created to 
handle yet another incoming connection. As was the case with the first MTA process (CI) 

25 415, the second MTA process (C2) 417 connects to the flow control filter 420. The process 
may continue for any arbitrary number of incoming connections, as may be supported by a 
given embodiment (e.g., up to hardware resource limits). 

The reporting to the flow control filter 420 occurs at every stage of the connection. 
At the beginning of its connection, for example, the child MTA (CI) 415 sends a message to 

30 the flow control filter 420 indicating that an incoming connection has been requested from a 
particular domain. In response to this message, the flow control filter 420 approves or 
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disapproves the connection request, based on what has transpired already (e.g., number of 
connections from this domain) and the policy/rules 425 currently in force. Similarly, at each 
stage of the SMTP dialog (between the outside MTA 401 and the child MTA (CI) 415), the 
child MTA (CI) 415 posts a message to the flow control filter 420 for indicating the exact 
5 contents of the SMTP dialog. Again, the flow control filter 420 approves or disapproves 
each phase of the SMTP dialog exchange (e.g., "proceed" or "do not proceed"), based on 
what has transpired already and the policy/rules 425 currently in force. In this manner, 
policy can also be applied based on the specific data which comprises a given e-mail 
message. 

10 When the final interaction between a given child MTA (e.g., 415 or 417) and the flow 

control filter 420 has completed, the flow control filter 420 issues a final disposition 
indicating whether the current incoming e-mail message should be accepted or rejected. 
Thereafter, the connection between the child MTA and the flow control filter 420 may be 
terminated. If the incoming connection is dehvering several messages, the connection 

15 between a corresponding child MTA and the flow control filter 420 will persist, so that it 
may be reused until the child MTA terminates the connection (at the conclusion of its 
operation, or in response to instructions from the flow control filter 420). 

D. Flow control filter architecture and basic operation 

Fig. 5 is a block diagram illustrating the flow control filter of the present invention in 
20 further detail. As shown, flow control filter 500 includes a Milter API (application 

programming interface) module 501, callback functions 505, and a flow control engine (core) 
509. The Milter API 501 provides an interface between the flow control filter 500 and child 
MTAs, so that each child MTA may connect with the flow control filter 500. In operation, 
the Milter API 501 registers callback functions 505 with the corresponding Sendmail Milter 
25 service. The callback functions 505 correspond to various phases of the SMTP dialog 

including, for example, a "connect" callback function, a "MAIL FROM" callback function, 
and a "RCPT TO" (i.e., recipient) callback function. In this manner, the flow control filter 
500 may register its own custom handlers (i.e., callback functions 505) with the Sendmail 
Milter service, and do so in a manner that allows the flow control filter 500 to retain session- 
30 specific information (e.g., stored by the callback functions in thread-specific memory). For 
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example, the "connect" callback function will store connection information (e.g., connection 
handle) for the current session. 

By maintaining session-specific information, the flow control filter 500 can track 
information characterizing system operation. For instance, the flow control filter 500 can log 

5 timestamp information pertaining to the occurrence of a particular event of interest, as well 
as maintaining running totals indicating how often that event has occurred. By comparing 
this information against configurable poHcy rules (e.g., maximum number of times that a 
particular event is allowed to occur in a given time period), the flow control filter 500 may 
selectively block particular events (e.g., until a configurable timer has elapsed). Thus, the 

10 tracking of information allows the flow control filter 500 to moderate traffic flow through 
one's e-mail system. 

Basic operation occurs as follows. When a new connection request arrives, the flow 
control filter 500 creates a new connection thread (e.g., "Thread #1") within the flow control 
engine 509. The connect callback function stores relevant connection information ("connect 

15 info") in thread-specific memory (e.g., memory for Thread #1). At the level of the flow 

control engine 509, the incoming information for the current SMTP phase - here, connection 
information - is compared against configurable policy rules. If no rules are violated by 
permitting the connection, then the flow control engine 509 (Thread #1) returns an 
acknowledgment of "OK", whereupon processing may continue to the next SMTP phase. 

20 Otherwise, the flow control engine 509 returns a "Reject" result, whereupon the incoming 
connection is rejected. 

If a "Reject" result did not occur, processing of subsequent SMTP phases continues in 
this manner, with the other corresponding callback functions contacting the flow control 
engine 509 about specific SMTP-phase information. Therefore, after the connection is 

25 approved, processing of the SMTP "MAIL FROM" phase ensues, using the "MAIL FROM" 
callback function. Here, the " MAIL FROM" callback function contacts the flow control 
engine 509 with sender information. In the same manner as before, the flow control engine 
509 either accepts or rejects the information, based on application of policy rules to the 
sender information. Finally, the "RCPT TO" callback function posts the recipient 

30 information to the flow control engine 509. Again, this information is either accepted or 

rejected by the flow control engine 509. In any instance where a rejection occurs, subsequent 
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SMTP phases for the incoming message are blocked. The system may be configured to 
return a specific error code and/or a message (e.g., "AOL.com has exceeded its totals for the 
hour"). If no rejection has occurred, on the other hand, the incoming message will be 
successfully processed. 

5 E. Policy rules 

1. Classes of hosts 

As described above, the flow control filter references a policy/rules configuration file, 
for determining how a particular event is processed. In the currently preferred embodiment, 
the configuration file describes administrator-defined classes of hosts, with each class being 
10 defined by domain. The classes may be assigned arbitrary administrator-supplied class 
names. During the "connect" SMTP phase for a given message, the flow control filter 
attempts to classify the host (domain) for an incoming message into one of the administrator- 
defined host classes. 

Each class may be defined to include events to monitor. Each event monitored is 
15 associated with three class members: current running total, maximum, and timeframe (to 
monitor). Additionally, the class tracks a timestamp indicating when monitoring 
conmienced. During run time, these members may be in an in-memory table upon loading of 
the configuration file, and maintained on an ongoing basis by corresponding threads (from 
the flow control engine) during message processing. Thus, during message processing, 
20 threads from the flow control engine update totals, as appropriate. When a limit is exceeded 
(e.g., in a given time period), a corresponding administrator-defined error message is 
returned and further traffic from that host is blocked (until resetting of the time period). The 
system administrator may configure how firmly the system should reject the message: either 
the message is blocked as a temporary failure (which invites the sending MTA to resend 
25 later), or it is blocked as a permanent failure (which typically generates a "bounce" message). 
The configuration file supports an "aggregation" option. When aggregation is 
enabled for a class, the class data structure applies to all hosts (e.g., all AOL.com hosts). For 
example, if A. AOL.com and B. AOL.com both connect to the system, then the running totals 
for both are applied to the corresponding class data structure. If aggregation is not enabled, 
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the system will create separate instances of each class. Each instance will have an identical 
copy of the specified limits, however, the running totals for each are maintained separately. 

2. Policy configuration 

The configuration file is a text file whose contents follow the well-known Apache 
configuration file format. A class configuration block defines a set of limits for cUents that 
fit in that class. Within each block, the limits are defined, as well as patterns that are 
matched against incoming connections. Each client connection is associated with the first 
class whose definition matches it. That cUent's inbound SMTP traffic is then subject to the 
Umitations defined by that class. 

A Class is defined with the following syntax: 

<Class classname> 
</Class> 

Any number of classes can be defined. A classname is a name unique among class 
definitions and consists of any combination of ASCII letters, numbers, digits or punctuation. 
A Class definition includes the matching specifications and the limit specifications for that 
class. In the currently preferred embodiment, the supported settings and their permitted 
values are as follows: 



Parameter 1 


Value Format j 


Description J 


Host 

1 


hostname 
.domain 

ip-addr 1 
ip-hlock/mask \ 


Specifies which hosts fall within this class. A specific ; 
hostname must match exactly (matching is case-insensitive). 
A domain must match the hostname by being a right- | 
justified substring match in the hostname (matching is case- ' 
insensitive). The IP address must match exactly, or the IP 1 
address of the client must fall within the specified IP block. | 
If is present, all hosts match. Multiple Host entries may | 
be present in any class definition. ^ 


I 

Aggregate ' 


{ True 1 False }; 


If True, the running totals for this class apply across all hosts ^ 
that fall within the class. If False each host within the class 
has individual limits as defined within this class. For ! 
example, if the class applies to all hosts in domain ^ 
example . com and connections arrive from hosts 
a . example . com and b . example . com, then if this feature is j 
set to False, each host is counted as having made one J 
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f 

\ 


* ■■ i 

i 

1 
\ 


connection and the limits defined apply to each host ; 
individually; if this feature is set to True, the entire class is 
counted as having made two connections and the limits 
defined apply to the class-wide totals. The default is False. 


Cascade [ 

\ 


I 
I 

{ True 1 False } j 


If this is True and an arriving connection or message would j 
cause the class or host to exceed its limits, the filter will 
attempt to place the connection into the next available 
matching class whose limits have not yet been exceeded. If 
False, the message will fail as defined by the Response ^ 
setting. The default is False. ^ 


\ 

Connections^ 


j 

lim / time | 


Specifies that no more than lim connections can arrive within ^ 
the specified time. The time is assumed to be in seconds 
unless termmatea oy a leuer sucn as a ^^uays^, n ^^nuuibj, ui ^ 
m (minutes). These can also be combined (e.g. Id6h for one : 
day and six hours) . 


Envelopes 


Urn / time \ 


Specifies that no more than lim distinct messages can arrive 
within the specified time. The time is as defined above. 


Message ^ 

\ 


[SMTP error: [i 
ESC : ]] 
message 


Defines the error message to be returned whenever the 
selected Response is invoked. The message is required. It is ' 
also possible to specify a standard SMTP error code as per ^ 
RFC821, and an Enhanced Status Code (ESC) as defined by 
RrCloyj. it no message is speciriea, wnaiever me ivii/\ s 
default response text message is will be used. J 


Recipients \ 


lim / time | 


Specifies that no more than lim distinct recipients can be 
addressed within the specified time. The time is as defined 
above. j 


\ 

Response ^ 


\ DISCARD 1 
REJECT 1 
TEMPFAIL } ! 

i 


Selects the action to be taken when one or more of the Umits ; 
for the class is exceeded, discard causes the server MTA to I 
retum a success message to the client MTA and silently drop i 
the message; reject causes the server MTA to retum a 
permanent (SMTP 5xx) error message to the client MTA; 
and TEMPFAIL causes the server MTA to retum a temporary i 
[oiyiir ^xxj error mcosage lo liic ciicul iviirv. mc uoiauiL la 

REJECT. 


j Senders 

1 \ 


lim /time 


Specifies that no more than lim distinct senders can be 
addressed within the specified time. The time is as defined 
above. 


1 Volume 


lim /time 


Specifies that no more than lim bytes of message body can 
be sent within the specified time. The time is as defined 
above. The lim can be specified in larger units by ending the 
limit value with a k (for kilobytes), m (for megabytes) or g 
(for gigabytes). 
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The order of the hnes in the configuration file is significant. For example, if an incoming 
connection matches more than one class definition, the matching class definition that appears 
first in the file will be the one selected. 

The absence of any of the limit hnes in a class suggests that that criteria will not be 
5 used to limit the flow for that class. The absence of all of the limit hnes in a class suggests 
that hosts in that class may send mail to the server MTA without bound. The configuration 
file is read when the filter is started, or upon a new connection if the configuration file has 
been updated since it was last read. Whenever the configuration file is reloaded, all current 
counts and timestamps used to maintain totals for each class are discarded. If an incoming 
10 connection applies to no classes defined in the file, the arriving message(s) are delivered 
normally. 

3. Policy configuration example 

The following configuration class definitions, which may be created in a text editor 
by a user, illustrates specification of policy rules. 

15 

<Class Send!nail> 

Host . sendmail . com 
</Class> 

20 <Class AOL> 

Host *. aol.com 
Recipients 1000/ld 
</Class> 

25 <Class Others> 
Aggregate True 
Host * 

Connections 2 50 /Id 
</Class> 

30 

As shown, the script defines three classes: Sendmail, AOL, and Others. In the case of Class 
AOL, for example, the host (domain matching ^MoLcom) is defined to have a maximum 
number of different (i.e., unique) recipients per day (e.g., 1,000). 

F. Summary of overall operation 

35 Figs. 6A-D provide a sunmiary of the overall methodology 600 of the present 

invention. The following method steps occur in the context of an incoming message that is 
being processed by the e-mail system (i.e., MTA forking has already occurred) and now the 
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system is ready to evoke the services of the flow control filter of the present invention. The 
method steps are as follows. 

Invocation of the flow control filter begins with the MTA (i.e., a child MTA of the 
original (parent) listener) connecting to the flow control filter (e.g., using Sendmail Milter 

5 protocol), as shown at step 601; the filter accepts the connection. As shown at step 602, the 
MTA and the filter perform a handshake sequence, including feature and parameter 
negotiation. This allows the two to agree upon what version of the Sendmail Milter protocol 
is to be employed for conmiunication. At the conclusion of the handshake sequence, a new 
thread is created (i.e., in the flow control engine) for processing the new connection/message. 

10 Now, the MTA passes to the filter the corresponding connection information (e.g., IP address 
and host name) of the sending MTA, as indicated at step 603. This information is passed to 
the previously described connection callback function, thus making the connection 
information available to the filter. Thus, based on the connection information, the filter may 
look up matching class data from the configuration file, as shown at step 604. In the event 

15 that no matching class data is found, tested at step 605, the filter will assume unrestricted 
access for the host and therefore will accept the connection and message. In that case, the 
flow control engine thread handling the connection may terminate, as there is no further 
filtering work to be done for this incoming connection and message; the MTA proceeds 
normally with no further interaction with the filter. Otherwise, the method proceeds to the 

20 following filtering steps. 

At step 606, the method tests whether class limits have been reached. In the event 
that limits have not been reached (i.e., "true") at step 606, the filter instructs the MTA to 
continue and increments the current connection count, as indicated at step 607. Otherwise 
(i.e., "false"), the method terminates with the filter rejecting the connection and returning an 

25 administrator-defined error code, as indicated by step 608. In the event that the process did 
not terminate, at step 609, the MTA reports the sender information to the filter; this occurs in 
response to the "MAIL FROM" SMTP phase. At step 610, the method notes the sender (i.e., 
who is the sender) in the class. The administrator-defined class may include, for example, a 
sender-based parameter indicating that the filter should note the number of unique senders 

30 that have arrived in a given timeframe for this particular host (of the class). This is designed 
in particular to catch "spammer" addressers, as they tend to randomize sender addresses. As 
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another example, a sender-based parameter may indicate other maximum limits, such as a 
maximum aggregate byte total or maximum message count acceptable from a given sender 
(e.g., over a specified time period). In a manner similar to above, the method tests whether 
class' sender limits have been reached, as indicated at step 611. In the event that limits have 
5 not been reached (i.e., "true") at step 611, the filter instructs the MTA to continue and 

increments the current unique sender totals, as indicated at step 612. Otherwise, the method 
terminates with the filter rejecting the message (returning any administrator-defined error 
code), as indicated by step 613. 

In the event that the filtering process has not terminated based on sender information, 

10 the method proceeds to test recipient ("RCPT TO") information, at step 614. The 

configuration file allows the administrator to define a class that limits the number of unique 
recipients received for that class, over any given time span. As a given message may have 
multiple recipients, step 614 repeats for each recipient (information) of the message. As 
before, if specified limits are exceeded, the method terminates with the filter rejecting the 

15 message (returning any administrator-defined error code), as indicated by step 615. 
Otherwise, the method updates totals, at step 616, and proceeds. 

At step 617, the MTA reports the message body, which may be transmitted as one or 
more blocks. At step 618, the method updates a running total of message size. This 
information is used to determine the aggregate total of bytes received from a given source 

20 over a period of time. At step 619, the MTA reports "end of message" for the current 

incoming message. Then, at step 620, the method compares the message size against class 
limits specified in the configuration file. Again as before, if specified limits are exceeded, 
the method terminates with the filter rejecting the message (retuming any administrator- 
defined error code), as indicated by step 621. Otherwise, the incoming message has passed 

25 all filters and is accepted, as shown at step 622. Now, the method may repeat for other 

incoming messages. Throughout the operation of the method, class totals are updated. Once 
a given time period has elapsed for a class (i.e., current time exceeds the combination of the 
original "timestamp" plus "time hmit"), totals for the class may be reset (e.g., to zero). 
Appended herewith as Appendix A are source code listings (in the C/C++ 

30 programming language) providing further description of the present invention. A suitable 
C/C++ compiler/Unker environment for compiUng the source Ustings is available from a 
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variety of vendors, including Microsoft Corporation of Redmond, WA and Borland Software 
Corporation (formerly, Inprise Corporation) of Scotts Valley, CA. 

While the invention is described in some detail with specific reference to a single- 
5 preferred embodiment and certain alternatives, there is no intent to limit the invention to that 
particular embodiment or those specific altematives. All told, those skilled in the art will 
appreciate that modifications may be made to the preferred embodiment without departing 
from the teachings of the present invention. 
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APPENDIX A: SOURCE CODE LISTINGS 



** CLASS description of a class and its current state 

The class whose limits and totals are applied will be selected upon 
receipt of a MAIL FROM command. This does mean the applied class 
can change between envelopes within the same connection. 



typedef struct Class * Class; 

struct Class 

{ 



iDOOi 


cls_ 


_aggr ; 


bool 


cls_ 


_cascade ; 


u_long 


cls_ 


_conns ; 


u„long 


cls_ 


_maxconns ; 


u„long 


cls_ 


_envs ; 


u_long 


cls_ 


_maxenvs ; 


u_long 


cls_ 


_maxsenders; 


u„long 


cls_ 


_maxrcpts; 


size_t 


cls_ 


_avol ; 


size_t 


cls_ 


_maxavol; 


t ime_t 


cls_ 


_connstart ; 


t ime_t 


cls_ 


_conntlim; 


time_t 


cls_ 


_envstart ; 


time__t 


cls_ 


„envtlim; 


time_t 


els. 


_avol start; 


time_t 


els. 


„avoltlim; 


time_t 


cls_ 


„sendertlim; 


time_t 


cls_ 


_repttlim; 


char * 


els. 


_message; 


char * 


els. 


.response; 


Vector 


els. 


_hostpat ; 


Vector 


els. 


_senders ; 


Vector 


els. 


„rcpts ; 


smutex_t 


els. 


_lock; 



}; 



/* aggregate? */ 

/* cascade? */ 

/* connection count */ 

/* max, connection count */ 

/* envelope count */ 

/* max. envelope count */ 

/* max. sender count */ 

/* max. recipient count */ 

/* aggregate volume */ 

/* max. aggregate volume */ 

/* conn, count start time */ 

/* conn, count time limit */ 

/* envelope count start time */ 

/* envelope count time limit */ 

/* avol start time */ 

/* avol time time limit */ 

/* sender list time limit */ 

/* recipient list time limit */ 

/* failure message to use */ 

/* failure disposition */ 

/* host patterns which apply */ 

/* vector of senders */ 

/* vector of recipients */ 

/* lock for this record */ 



** MLFI_CONNECT handler for new connections 

■k 

Parameters: 

ctx milter context 
** host -- hostname (possibly NULL?) 

ip — address, in in_addr form (possibly NULL?) 

** Return value: 

** An SMFIS_* constant. 



sf sistat 

mlf i_connect (ctx, host, ip) 
SMFICTX *ctx; 
char *host; 



Page 28 of 43 



_SOCK_ADDR *ip; 

Context pfc; 
time_,t now; 

/* see if the config file was updated; if so, reload and reset 
if (I pf_conf check { ) ) 

return SMFIS_TEMPFAIL; 

/* initialize a new context for this session */ 
pfc = pf_initcontext { ) ; 
if (pfc == NULL) 
{ 

if (dolog) 
{ 

syslog(LOG_ERR, "unable to allocate new context: %m 
return SMFIS_TEMPFAIL; 

} 

} 

/* store the client hostname and IP */ 
if (host != NULL) 

pfc->ctx_clienthost = arena_strdup (pf c->ctx_arena, host); 
if (ip != NULL) 
{ 

pfc->ctx„clientip = arena„malloc (pf c->ctx„arena, 

sizeof (_SOCK_ADDR) ) ; 
memcpy (pf c->ctx_clientip, ip, sizeof (_SOCK„ADDR) ) ; 

} 

/* store it in thread-specific space */ 
smf i_setpriv{ctx, pfc) ; 

/* select the class */ 
pfc->ctx_class = pf_selectclass (pfc) ; 
if (pf c->ctx_class == NULL) 
{ 

if (dolog) 

syslog(LOG_DEBUG, "%s: no classes match; accepting" 
pf c->ctx_msgid) ; 
pf_cleanup (ctx) ; 
return SMFIS_ACCEPT; 

} 

pfc ->ctx_conf read = TRUE; 

/* if we're over the limit, enact the class's disposition */ 

if (pf_atlimit (pf c->ctx_class) ) 

{ 

int ret; 

pf_failmsg(pf c->ctx_class, Scret) ; 

pf_cleanup (ctx) ; 
return ret; 

} 

/* lock the class */ 
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(void) smutex„lock(&pfc~>ctx_class->cls_lock) ; 

/* increment class connection counter */ 
(void) time ( Scnow) ; 
5 if (now > pf c->ctx__class->cls_connstart + 

pf c->ctx„class->cls_conntlim) 

{ 

pf c->ctx_class->cls_conns = 0; 
pfc->ctx_class->cls_connstart = now; 

10 } 

pf c->ctx„class->cls__conns++; 



15 



30 



unlock the class */ 
(void) smutex„unlock(&pfc->ctx_class->cls_lock) ; 

return SMFIS_CONTINUE; 



20 MLFI^BODY handler for an arbitrary body block 

Parameters: 
** ctx milter context 
** bodyp — body block 
25 ** bodylen — amount of data available at bodyp 
■k * 

** Return value: 

An SMFIS_* constant. 



sf sistat 

mlf i_body (ctx, bodyp, bodylen) 
SMFICTX *ctx; 
u_„char *bodyp; 
35 size„t bodylen; 

{ 

Context pfc; 

#ifndef DEBUG 
40 assert (ctx != NULL); 

#endif /* ! DEBUG */ 

assert (bodyp != NULL); 

pfc = (Context) smfi_getpriv(ctx) ; 
45 assert (pfc 1= NULL); 

/* increment size count */ 
pf c->ctx_msgsize += bodylen; 

50 return SMFIS_CONTINUE; 



MLFI_EOM -- handler called at the end of the message; we can now 
55 ** decide based on the configuration what to do with this message, 
update class information, and return the result 
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** Parameters: 

ctx — milter context 

■k -k 

** Return value: 

An SMFIS_* constant. 

*/ 

sf sistat 

mlf i_,eom{ctx) 

SMFICTX *ctx; 

{ 

bool added; 
int c; 
int n; 
int ret; 
time_t now; 
char *hostname; 
ADDRESS a; 
Context pfc; 

#ifndef DEBUG 

assert (ctx 1= NULL); 
#endif /* ! DEBUG */ 

pfc = (Context) smfi_getpriv{ctx) ; 
assert (pfc 1= NULL); 

/* get hostname; used in the X header and in new MIME boundaries 
hostname = smf i_getsymval (ctx, "j"); 
if (hostname == NULL) 

hostname - HOSTUNKNOWN; 

/* grab current time */ 
(void) time(&:now); 

/* lock the class */ 

(void) smutex_lock(&pfc->ctx_class->cls„lock) ; 
/* 

** Increment class counters, resetting any values which have 
** expired. 

*/ 

envelope counter */ 
if (now > pf c->ctx_class->cls_envstart + 
pfc->ctx_class->cls_envtlim) 

{ 

pf c->ctx„class->cls_envs = 0; 

pf c->ctx_class->cls_envstart = now; 

} 

pf c->ctx_class->cls_envs++; 
/* volume counter */ 

if (now > pf c->ctx_class->cls_avolstart + 
pf c->ctx_class~>cls„avoltlim) 

{ 

pfc->ctx_class->cls„avol = 0; 
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pf c->ctx„class->cls_avolstart = now; 

} 

pf c->ctx_class->cls_avol += pf c->ctx_msgsize; 

5 /* sender counter */ 

added - FALSE; 

for (c = 0; c < vector_length {pf c->ctx__class->cls_senders) ; C++) 
{ 

/* get next record */ 
10 a = vector_index(pfc->ctx„class->cls_senders, c) ; 

/* if it's the same sender, update the time and relocate */ 
if {sm__strcasecmp(string_get_data{a->addr_addr) , 
pf c->ctx_sender) == 0) 

15 { 

vector_remove (pf c->ctx_class->cls„senders, c) ; 

a->addr_when = now; 
O vector_append(pfc->ctx_class->cls__senders, a) ; 

^,0 added = TRUE; 

€1 20 break; 

T2 /* if not expired, stop looking */ 

f'^'^ if {a->addr_when + pf c->ctx_class->cls_sendertlim >= now) 

2 25 break; 

^ /* expired; recycle the record 

□ (void) vector_remove {pfc->ctx_class->cls_senders, c) ; 

Co string„blank{a->addr_addr) ; 

yj 30 string_printf (a->add2r_addr^ pfc->ctx__sender) ; 

a->addr_when = now; 

vector_append(pfc->ctx_class->cls„senders, a) ; 
tl added = TRUE; 



35 



} 



/* couldn't recycle an old record, so make a new one */ 

if (1 added) 

{ 

a = arena_zalloc (main„arena, sizeof (struct address)); 
40 a->addr_addr = string_new(main_arena, pf c->ctx_sender) ; 

a->addr_when = now; 

vector_append(pfc->ctx„class->cls_senders, a) ; 

} 

45 /* recipient counter */ 

for (n = 0; n < vector_length (pf c->ctx_rcpts) ; n++) 
{ 

added = FALSE; 

for (c = 0; c < vector„length (pf c->ctx„class->cls_rcpts ) ; C++) 

50 { 

/* get next record */ 

a = vector_index (pf c->ctx_class->cls_rcpts , c) ; 

/* if it's the same recipient, update and relocate */ 
55 if (sm_strcasecmp(string_get_data(a->addr_addr) , 

vector_index(pfc->ctx_rcpts, n) ) == 0) 

{ 
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vector_remove (pfc->ctx_class->cls_rcpts, c) ; 
a->addr_when = now; 

vector_append(pfc->ctx_class->cls_rcpts, a) ; 
added = TRUE; 
break ; 

} 

/* if not expired, stop looking */ 
if (a->addr_when + pf c->ctx_class->cls_sendertlim 
>= now) 
break; 

/* expired; recycle the record */ 

(void) vector_remove{pfc->ctx_class->cls_senders, c) ; 
string_blank (a->addr_addr) ; 

string_printf (a->addr_addr, pf c->ctx_sender ) ; 
a->addr_when = now; 

vector_append{pfc->ctx_class->cls_senders, a) ; 
added = TRUE; 

} 

/* couldn't recycle an old record, so make a new one */ 

if (i added) 

{ 

a = arena_zalloc (main_arena, sizeof (struct address)); 
a->addr_addr = string„new {main_arena, pf c->ctx_sender) 
a->addr_when = now; 

vector_append{pf c->ctx_.class->cls_senders, a) ; 

} 



} 



/* unlock the class */ 

(void) smutex_unlock(&pfc->ctx_class->cls_lock) ; 

/* identify the filter, if requested */ 

if (addxhdr) 

{ 

String xfhdr; 

xfhdr = string_alloc (pf c->ctx„arena, 0); 

string_printf (xfhdr, "%s v%s %s %s", PF_PRODUCT, PF_VERSION, 
hostname^ 

pfc->ctx_msgid 1= NULL ? pf c->ctx_msgid 

: MSGIDUNKNOWN) ; 

if (sinfi_addheader(ctx, XHEADERNAME, 

string_get_data (xfhdr) ) != MI_SUCCESS) 



{ 



if (dolog) 

syslog(LOG_ERR, " smf i_addheader () failed") ; 

pf_cleanup (ctx) ; 
return SMFIS„TEMPFAIL; 



} 



} 

/* Implement disposition */ 
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ret = SMFIS_ACCEPT; 

if (pf_atlimit (pf c->ctx_class) ) 

pf_f ailmsg {pf c->ctx_class, Scret) ; 

if {ret == SMFIS„ACCEPT dolog) 

syslog{LOG_INFO, "%s accepted", pf c->ctx„msgid) 

/* s'all good 
pf_cleanup (ctx) ; 
return ret; 
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